Django models
Overview

Django is a complete web application framework built on the Python programming language. It was originally developed to meet the needs of a Kansas-based online newsroom, and has been open source since 2005. It is generally considered to be the Python equivalent of Ruby on Rails, and follows a similar structure and similar patterns (in particular the MVC pattern). It is made up of several loosely coupled modules, including a page templating system, an out-of-the-box admin interface, and a models framework.

In this paper we focus on Django's models framework. This framework allows a web-app developer to specify the app's key data representations ("models") entirely in concise Python code. The framework then provides the developer with a simple, automatically generated API for accessing these models, and abstracts away all corresponding database operations.

Sample application

The following example shows the use of models for a simple discography web app. It uses the following structure:

* An artist has a stage name, a genre, and a headshot.
* A producer has a name, a street address, a city, a state/province, a country, and a Web site.
* An album has a title and a release date. It also has one or more artists and a single producer.
We first define our models in the Python file models.py:

    from django.db import models

    class Producer(models.Model):
        name = models.CharField(max_length=30)
        address = models.CharField(max_length=50)
        city = models.CharField(max_length=60)
        state_province = models.CharField(max_length=30)
        country = models.CharField(max_length=50)
        website = models.URLField()

    class Artist(models.Model):
        stage_name = models.CharField(max_length=30)
        genre = models.CharField(max_length=30)
        headshot = models.ImageField(upload_to='/tmp')

    class Album(models.Model):
        title = models.CharField(max_length=100)
        authors = models.ManyToManyField(Artist)
        producer = models.ForeignKey(Producer)
        release_date = models.DateField()

In general, each model corresponds to a database table, and each attribute corresponds to a database column. By default, Django assumes that the attribute names and the corresponding database column names are the same. There is a one-to-many relationship between producers and albums, and a many-to-many relationship between albums and artists. The model defines the one-to-many relationship using the "ForeignKey" type and the many-to-many relationship using the "ManyToManyField" type. Django automatically creates a primary key field called id for each model, and automatically creates an additional "join" table to handle the many-to-many mapping between albums and artists.

Then we create the corresponding DB tables from the command line:

    python manage.py syncdb

This command generates and runs the SQL code that creates any tables that do not exist yet, but it will not alter the schema of tables that already exist.

Now we can execute the usual CRUD commands in Python (explanations are in comments):

    >>> from albums.models import Producer
    >>> p1 = Producer(name='Producer 1', address='75 Arlington Street',
    ...     city='Boston', state_province='MA', country='U.S.A.',
    ...     website='http://www.producer1.com/')
    >>> p1.save()  # Save the newly created p1 to the DB
    >>> p2 = Producer(name="Producer 2", address='10 Fawcett St.',
    ...     city='Cambridge', state_province='MA', country='U.S.A.',
    ...     website='http://www.producer2.com/')
    >>> p2.save()
    >>> producer_list = Producer.objects.all()  # Read all Producers from the DB
    >>> producer_list
    [<Producer: Producer object>, <Producer: Producer object>]
    >>> p = Producer.objects.get(name="Producer 1")
    >>> p.delete()  # Delete p1 from the DB
    >>> p = Producer.objects.get(website="http://www.producer2.com/")
    >>> p.city = 'London'  # Update p2's city field
    >>> p.save()  # Save the changes to the DB

Technical points

DB integration

Django's models framework is essentially a layer of abstraction over the content of the web app's underlying database. There is very little coupling between Django code and the nature of the actual database: many popular databases are supported (e.g. MySQL, PostgreSQL, SQLite), and the details of the chosen database (login, password, path, type) are specified in only one place, the settings.py file. This makes it extremely easy to swap out the underlying database backend, and also respects the DRY principle.

Looking at the code, Django's implementation uses the Python DB API to connect to a given database and execute SQL code. Django itself concentrates on generating the appropriate database-specific SQL code from the higher-level commands given through its API; it also takes care of basic security (e.g. preventing SQL injection attacks by appropriately escaping and quoting field data).

Query execution

It seems that Django's models framework has two primary concerns: simplicity and performance. It achieves simplicity by providing an intuitive API following the CRUD principle (create, read, update and delete). The developer is never required to write any SQL. Django uses a simple string parser internally to handle moderately complex queries. For example,

    artists = Artist.objects.filter(stage_name__startswith="J")

will find all artists whose stage name starts with a J.
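To make the idea of parsing filter keywords more concrete, here is a minimal sketch of how a keyword such as name__startswith could be split into a field name and a lookup type, and then turned into a parameterized WHERE clause. This is not Django's actual implementation: the names SQL_OPERATORS, parse_lookup and build_where are hypothetical, and only a handful of lookup types are covered.

```python
# Hypothetical sketch; Django's real query builder is far more elaborate.
LOOKUP_SEP = "__"

# Illustrative subset of lookup types, each mapped to an SQL template.
SQL_OPERATORS = {
    "exact": "{col} = %s",
    "startswith": "{col} LIKE %s",
    "contains": "{col} LIKE %s",
    "gt": "{col} > %s",
}


def parse_lookup(keyword):
    """Split 'name__startswith' into ('name', 'startswith')."""
    field, sep, lookup = keyword.rpartition(LOOKUP_SEP)
    if not sep or lookup not in SQL_OPERATORS:
        # No recognised lookup suffix: treat the whole keyword as a
        # field name and fall back to an exact match.
        return keyword, "exact"
    return field, lookup


def build_where(**filters):
    """Return (where_clause, params) for the given filter keywords."""
    clauses, params = [], []
    # Sort the keywords so the generated clause is deterministic.
    for keyword, value in sorted(filters.items()):
        field, lookup = parse_lookup(keyword)
        clauses.append(SQL_OPERATORS[lookup].format(col=field))
        # Wildcards go into the parameter, never into the SQL text.
        if lookup == "startswith":
            value = "{}%".format(value)
        elif lookup == "contains":
            value = "%{}%".format(value)
        params.append(value)
    return " AND ".join(clauses), params


where, params = build_where(name__startswith="J")
print(where, params)
```

Note that the values are returned separately from the SQL text, so that they can be handed to the database driver as parameters. This is one common way of avoiding SQL injection, in the spirit of the escaping described under DB integration.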
In terms of performance, Django naturally cannot make a given query go any faster, as it relies on the Python DB API and the underlying database. However, it does avoid running unnecessary queries, by performing what could be called "lazy querying". For example, consider the following Python code:

    articles = Article.objects.filter(author="John")
    specificArticles = articles.filter(year="2008")
    for article in specificArticles:
        print(article.title)

The obvious implementation would be to first run a query like "SELECT * FROM articles WHERE author='John'", and then search through the returned set for articles from 2008. However, Django runs only one, more efficient query: "SELECT * FROM articles WHERE author='John' AND year='2008'". This is because it runs a query only when the returned data is absolutely necessary (e.g. when we iterate over it).

Caching

While performing database queries every time a model is touched is fine for web sites with low traffic, it does not scale well. Indeed, databases (and the hard drives on which they are stored) are relatively slow. So caching data in order to avoid hitting the database as much as possible becomes increasingly important as a web site gets more traffic.

Django provides some help with its caching framework. It can use various caching backends, most notably memcached (which is both fast and easily scalable). Django can perform certain basic forms of caching (e.g. caching each page on a web site, or the output of specific views) automatically, with almost no code or intervention from the developer (just a few lines in the settings.py file). However, it seems that for a highly dynamic web site such high-level caching would be of little use, and much more granular caching would be important (e.g. caching the output of certain complex queries). In that case Django provides decent APIs, but the bulk of the work is still left to the developer.
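As an illustration of the kind of granular caching that is left to the developer, here is a minimal sketch of the usual "check the cache, otherwise run the query and store the result" pattern. The SimpleCache class is a hypothetical in-memory stand-in for a real backend such as memcached, and cached_query is a hypothetical helper; only the get/set-with-timeout shape is meant to resemble the cache object in django.core.cache.

```python
import time


class SimpleCache:
    """Hypothetical in-memory stand-in for a backend such as memcached."""

    def __init__(self):
        self._store = {}

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if time.time() >= expires_at:
            # The entry has expired: drop it and report a miss.
            del self._store[key]
            return default
        return value

    def set(self, key, value, timeout=300):
        # Store the value together with its absolute expiry time.
        self._store[key] = (value, time.time() + timeout)


def cached_query(cache, key, run_query, timeout=300):
    """Return the cached result for key, calling run_query only on a miss.

    Caveat of this sketch: a result of None cannot be told apart
    from a cache miss, so it would be recomputed every time.
    """
    result = cache.get(key)
    if result is None:
        result = run_query()
        cache.set(key, result, timeout)
    return result


# Usage: the "expensive query" below runs once, then is served from cache.
calls = []

def expensive_query():
    calls.append(1)  # Record each real execution of the query.
    return ["Album A", "Album B"]

cache = SimpleCache()
first = cached_query(cache, "albums_by_producer_1", expensive_query)
second = cached_query(cache, "albums_by_producer_1", expensive_query)
```

The difficult part in practice, which this sketch does not address, is deciding when a cached result must be invalidated because the underlying data has changed.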
Comparisons with other frameworks

Rails

Django handles models in a fundamentally different way from Rails. Models in Django are self-contained entities defined in Python that are independent of the database. The models are simply a description of the object-relational mapping. In Rails, the schema is defined by database migrations, and the model attributes are then introspected from the resulting schema.

In the case of a many-to-many relationship between classes, Rails requires the user to define an intermediate "join" table linking the two classes. Django, on the other hand, automatically creates this "join" table when a field is defined as many-to-many in a model.

Rails has a few advantages relative to Django. Rails was designed with the philosophy of "convention over configuration," while Django requires many settings and environment variables in order to work properly. In order to create a model in Django, the user must edit an elaborate settings file to configure the database, add the application to the INSTALLED_APPS setting to activate the model, and implement an object-specific "to-string" method for useful output.

One of the most significant drawbacks of Django is its lack of support for automatic migration. Rails enables fast and flexible migrations because it tracks only the schema changes between revisions. It also automatically applies these migrations based on the desired revision. The limitations of Django's migration support are discussed in further detail below.

These points of preference are relatively minor (except perhaps for migrations). It therefore seems that the choice between Django and Rails boils down to the developer's personal preference, in particular for Python or Ruby.

Pylons

Pylons is a Python web framework that emerged after Rails, and is Django's primary "competitor" in the Python web framework space. Like Django, it uses the MVC design pattern, but it relies primarily on third-party components and libraries for its implementation.
Pylons does not provide its own database abstraction layer for models. Instead, the user must interact with the underlying database through open-source libraries such as SQLAlchemy or SQLObject, or through the Python DB API. As an out-of-the-box solution, Django therefore seems far superior; for a power user, however, Pylons' freer approach could be preferable.

Discussion

Ease of use

One of the key characteristics of Django, and no doubt one of the reasons for its success, is the sheer ease of developing web applications with it. This is due to a few factors. Firstly, it uses the Python language, which is easy to pick up and is already popular. Secondly, Django is a fully integrated solution: everything required for managing an application's data and abstractions is provided out of the box, and no external modules are required (to interface with a database, for example). Thirdly, the framework is designed to handle the common case in a simple manner.

There is a negative side to this apparent ease of use: Django offers little flexibility, and provides almost no support for many more exotic use cases. For example, choosing a specific MySQL table type (e.g. InnoDB or MyISAM) is not supported, and the developer has to manually modify the generated MySQL tables to change it. However, this is a relatively rare use case, and a developer who does need this kind of functionality is likely to know SQL as well.

Scalability

While Django seems primarily designed for building web applications quickly, it does have some key features which help applications built on it scale to high traffic levels. The first is its caching framework, which we have already seen. Caching, if done well, can very significantly decrease the number of hits to the database, which is often the bottleneck for web app performance.
Secondly, the architecture of Django is designed with the "share nothing" principle in mind, so the different components of a Django app (the web server, the media server, the database) can run on separate servers, and each one can run on multiple servers. This means that, to a certain extent, scaling problems can be solved by adding more servers.

Finally, Django has an advantage simply due to the speed of Python. Recent versions of Python are remarkably fast for an interpreted language, in particular compared to Ruby.

Poor migration support

As mentioned earlier, Django models are defined in Python and are independent of the database. Thus, the user is responsible for synchronizing the database with the model, essentially providing the same information twice. As a result, making changes to a database schema can be quite cumbersome. For example, to add a new field, Django requires that the user manually (using SQL) add the new database column before updating the model with the new field. These SQL commands can be generated using Django, though, so the user need not be familiar with SQL.

The Django handbook provides an elaborate sequential method for the seemingly simple task of modifying the schema. This can be considered a poor abstraction and a limitation of Django, because the user should not have to carefully manage migrations. However, Django should not be singled out, as many other popular model frameworks lack good migration support.

Relevant links

* Official Django documentation: http://docs.djangoproject.com/en/dev/
* Python DB API: http://www.python.org/dev/peps/pep-0249/
* memcached: http://www.danga.com/memcached/
* Pylons: http://pylonshq.com/